label dependence
From Multi-label Learning to Cross-Domain Transfer: A Model-Agnostic Approach
In multi-label learning, a particular case of multi-task learning where a single data point is associated with multiple target labels, it was widely assumed in the literature that, to obtain the best accuracy, the dependence among the labels should be explicitly modeled. This premise led to a proliferation of methods offering techniques to learn and predict labels together, for example where the prediction for one label influences predictions for other labels. Even though it is now acknowledged that in many contexts a model of dependence is not required for optimal performance, such models continue to outperform independent models in some of those very contexts, suggesting alternative explanations for their performance beyond label dependence, which the literature is only recently beginning to unravel. Leveraging and extending recent discoveries, we turn the original premise of multi-label learning on its head, and approach the problem of joint modeling specifically in the absence of any measurable dependence among task labels; for example, when task labels come from separate problem domains. We apply insights from this study to build an approach for transfer learning that challenges the long-held assumption that transferability of tasks comes from measurements of similarity between the source and target domains or models. This allows us to design and test a method for transfer learning that is model driven rather than purely data driven, and that is furthermore black box and model-agnostic (any base model class can be considered). We show that, in essence, we can create task dependence based on source-model capacity. The results we obtain have important implications and provide clear directions for future work, both in the areas of multi-label and transfer learning.
- North America > United States > New York > New York County > New York City (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- Research Report > New Finding (0.88)
- Research Report > Promising Solution (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.78)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Classifier Chains: A Review and Perspectives
Read, Jesse, Pfahringer, Bernhard, Holmes, Geoffrey, Frank, Eibe
The family of methods collectively known as classifier chains has become a popular approach to multi-label learning problems. This approach involves chaining together off-the-shelf binary classifiers in a directed structure, such that individual label predictions become features for other classifiers. Such methods have proved flexible and effective and have obtained state-of-the-art empirical performance across many datasets and multi-label evaluation metrics. This performance led to further studies of the underlying mechanism and efficacy, and investigation into how it could be improved. In the recent decade, numerous studies have explored the theoretical underpinnings of classifier chains, and many improvements have been made to the training and inference procedures, such that this method remains among the best options for multi-label learning. Given this past and ongoing interest, which covers a broad range of applications and research themes, the goal of this work is to provide a review of classifier chains, a survey of the techniques and extensions provided in the literature, as well as perspectives for this approach in the domain of multi-label classification in the future. We conclude positively, with a number of recommendations for researchers and practitioners, as well as outlining key issues for future research.
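The chaining idea in this abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch and is not taken from the review: the `Perceptron` and `ClassifierChain` classes, the label order, and the toy OR/AND/XOR labels are all assumptions chosen so that the effect is easy to see. Each model in the chain receives the original features plus the values of the labels that precede it.

```python
from typing import List, Sequence


class Perceptron:
    """Tiny binary linear classifier, standing in for any off-the-shelf base model."""

    def __init__(self, epochs: int = 50) -> None:
        self.epochs = epochs
        self.w: List[float] = []
        self.b = 0.0

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "Perceptron":
        self.w = [0.0] * len(X[0])
        self.b = 0.0
        for _ in range(self.epochs):
            for x, target in zip(X, y):
                error = target - self.predict_one(x)  # -1, 0 or +1
                if error:
                    self.w = [wi + error * xi for wi, xi in zip(self.w, x)]
                    self.b += error
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else 0


class ClassifierChain:
    """Hypothetical minimal chain: model j sees the features plus labels 0..j-1."""

    def __init__(self, n_labels: int) -> None:
        self.models = [Perceptron() for _ in range(n_labels)]

    def fit(self, X, Y) -> "ClassifierChain":
        for j, model in enumerate(self.models):
            # Training uses the TRUE values of the earlier labels as extra features.
            X_aug = [list(x) + list(y[:j]) for x, y in zip(X, Y)]
            model.fit(X_aug, [y[j] for y in Y])
        return self

    def predict_one(self, x) -> List[int]:
        preds: List[int] = []
        for model in self.models:
            # Prediction uses the chain's own earlier predictions instead.
            preds.append(model.predict_one(list(x) + preds))
        return preds


# Toy data (an assumption for illustration): labels are OR, AND, XOR of two bits.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0, 0, 0], [1, 0, 1], [1, 0, 1], [1, 1, 0]]

chain = ClassifierChain(n_labels=3).fit(X, Y)
chain_predictions = [chain.predict_one(x) for x in X]

# Baseline: the same linear model trained on XOR alone, without label features.
xor_only = Perceptron().fit(X, [y[2] for y in Y])
independent_xor_errors = sum(xor_only.predict_one(x) != y[2] for x, y in zip(X, Y))

print(chain_predictions)        # matches Y on this toy data
print(independent_xor_errors)   # at least 1: XOR is not linearly separable
```

In this toy setting the chain fits XOR only because the OR and AND predictions are available as inputs, which shows one way a chain can help a limited base model regardless of any dependence in the data.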
- North America > United States > New York > New York County > New York City (0.14)
- Asia > China (0.04)
- Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)
Classifier Chains: A Review and Perspectives
Read, Jesse, Pfahringer, Bernhard, Holmes, Geoff, Frank, Eibe
The family of methods collectively known as classifier chains has become a popular approach to multi-label learning problems. This approach involves linking together off-the-shelf binary classifiers in a chain structure, such that class label predictions become features for other classifiers. Such methods have proved flexible and effective and have obtained state-of-the-art empirical performance across many datasets and multi-label evaluation metrics. This performance led to further studies of how exactly the approach works and how it could be improved. In the past decade, numerous studies have explored the mechanisms of classifier chains on a theoretical level, and many improvements have been made to the training and inference procedures, such that this method remains among the state-of-the-art options for multi-label learning. Given this past and ongoing interest, which covers a broad range of applications and research themes, the goal of this work is to provide a review of classifier chains, a survey of the techniques and extensions provided in the literature, as well as perspectives for this approach in the domain of multi-label classification in the future. We conclude positively, with a number of recommendations for researchers and practitioners, as well as outlining a number of areas for future research.
- North America > United States > New York > New York County > New York City (0.14)
- Oceania > New Zealand > North Island > Waikato (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
Rectifying Classifier Chains for Multi-Label Classification
Senge, Robin, del Coz, Juan José, Hüllermeier, Eyke
Classifier chains have recently been proposed as an appealing method for tackling the multi-label classification task. In addition to several empirical studies showing their state-of-the-art performance, especially when used in their ensemble variant, there are also some first results on theoretical properties of classifier chains. Continuing along this line, we analyze the influence of a potential pitfall of the learning process, namely the discrepancy between the feature spaces used in training and testing: while true class labels are used as supplementary attributes for training the binary models along the chain, the same models need to rely on estimations of these labels at prediction time. We elucidate under which circumstances the attribute noise thus created can affect the overall prediction performance. As a result of our findings, we propose two modifications of classifier chains that are meant to overcome this problem. Experimentally, we show that our variants are indeed able to produce better results in cases where the original chaining process is likely to fail.
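The train/test discrepancy described in this abstract can be made concrete with a small sketch. It does not reproduce the paper's analysis or its two proposed modifications; the helper names `chain_inputs` and `attribute_noise_rate` and the toy values are hypothetical. The sketch only shows how the supplementary label attributes seen by the j-th chain model differ between training (true labels) and prediction (estimated labels).

```python
from typing import List, Sequence


def chain_inputs(X: Sequence[Sequence[float]],
                 labels: Sequence[Sequence[int]],
                 j: int) -> List[list]:
    """Inputs of the j-th chain model: the features plus the first j label values."""
    return [list(x) + list(y[:j]) for x, y in zip(X, labels)]


def attribute_noise_rate(Y_true: Sequence[Sequence[int]],
                         Y_pred: Sequence[Sequence[int]],
                         j: int) -> float:
    """Fraction of supplementary label attributes (labels 0..j-1) that are
    wrong when predictions replace the true labels."""
    if j == 0:
        return 0.0  # the first model in the chain has no label attributes
    wrong = sum(t != p
                for yt, yp in zip(Y_true, Y_pred)
                for t, p in zip(yt[:j], yp[:j]))
    return wrong / (j * len(Y_true))


# Toy values, assumed purely for illustration.
X = [[0.2], [0.5], [0.9], [0.1]]
Y_true = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 0]]
Y_pred = [[1, 0, 1], [1, 1, 1], [1, 0, 0], [0, 0, 0]]  # upstream estimates

train_view = chain_inputs(X, Y_true, 2)  # what the third model is trained on
test_view = chain_inputs(X, Y_pred, 2)   # what it receives at prediction time

print(train_view[1], test_view[1])               # [0.5, 0, 1] [0.5, 1, 1]
print(attribute_noise_rate(Y_true, Y_pred, 2))   # 0.25
```

The second instance shows the issue: the third model was trained with label attributes `0, 1` but is queried with `1, 1`, an input it may never have seen labeled that way.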
- Research Report > Experimental Study (0.95)
- Research Report > New Finding (0.89)
Multi-label Classification using Labels as Hidden Nodes
Competitive methods for multi-label classification typically invest in learning labels together. To do so in a beneficial way, analysis of label dependence is often seen as a fundamental step, separate from and prior to constructing a classifier. Some methods invest up to hundreds of times more computational effort in building dependency models than in training the final classifier itself. We extend some recent discussion in the literature and provide a deeper analysis, developing the view that label dependence is often introduced by an inadequate base classifier rather than being inherent to the data or underlying concept, and showing how even an exhaustive analysis of label dependence may not lead to an optimal classification structure. Viewing labels as additional features (a transformation of the input), we create novel neural-network-inspired methods that remove the emphasis on a prior dependency structure. Our methods have an important advantage particular to multi-label data: they leverage labels to create effective units in middle layers, rather than learning these units from scratch in an unsupervised fashion with gradient-based methods. Results are promising. The methods we propose perform competitively and also have important scalability properties.
- North America > United States (0.04)
- Europe > France (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)
Multi-Label Structure Learning with Ising Model Selection
Goncalves, Andre R. (University of Campinas) | Zuben, Fernando J. Von (University of Campinas) | Banerjee, Arindam (University of Minnesota, Twin Cities)
A common way of attacking multi-label classification problems is to split them into a set of binary classification problems and then solve each problem independently using traditional single-label methods. Nevertheless, when classifiers are learned separately, information about the relationships between labels tends to be neglected. Building on recent advances in structure learning in Ising Markov Random Fields (I-MRF), we propose a multi-label classification algorithm that explicitly estimates and incorporates label dependence into the classifier learning process by means of a sparse convex multi-task learning formulation. Extensive experiments considering several existing multi-label algorithms indicate that the proposed method, while conceptually simple, outperforms the contenders on several datasets and performance metrics. In addition, the conditional dependence graph encoded in the I-MRF provides useful information that can be used in a subsequent investigation of the reasons behind the relationships between labels.
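The independent-splitting baseline mentioned at the start of this abstract, often called binary relevance, can be sketched as follows. This is not the I-MRF method proposed in the paper; the function names and toy data are assumptions for illustration. The sketch splits a multi-label dataset into one binary problem per label, and then compares a marginal label rate with a conditional one to show the kind of relationship that the split ignores.

```python
from typing import List, Sequence, Tuple


def binary_relevance_split(X: Sequence[Sequence[float]],
                           Y: Sequence[Sequence[int]]) -> List[Tuple[list, List[int]]]:
    """One (features, binary targets) problem per label, each solved in isolation."""
    n_labels = len(Y[0])
    return [(list(X), [y[j] for y in Y]) for j in range(n_labels)]


def marginal_rate(Y: Sequence[Sequence[int]], k: int) -> float:
    """Empirical P(y_k = 1)."""
    return sum(y[k] for y in Y) / len(Y)


def conditional_rate(Y: Sequence[Sequence[int]], j: int, k: int) -> float:
    """Empirical P(y_k = 1 | y_j = 1); returns 0.0 if label j is never on."""
    rows = [y for y in Y if y[j] == 1]
    return sum(y[k] for y in rows) / len(rows) if rows else 0.0


# Toy data, assumed purely for illustration.
X = [[0.1], [0.4], [0.6], [0.9]]
Y = [[1, 1], [1, 1], [0, 0], [0, 1]]

tasks = binary_relevance_split(X, Y)
print(tasks[0][1], tasks[1][1])   # [1, 1, 0, 0] [1, 1, 0, 1]

# Label 1 is on in 75% of rows overall, but in every row where label 0 is on.
print(marginal_rate(Y, 1))        # 0.75
print(conditional_rate(Y, 0, 1))  # 1.0
```

A gap between the marginal and conditional rates is one simple signal of label relationships that independently trained classifiers have no way to exploit.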
- South America > Brazil > São Paulo > Campinas (0.04)
- North America > United States > Minnesota (0.04)
- North America > United States > Arizona (0.04)
Feature Selection for Multi-Label Learning
Spolaôr, Newton (University of São Paulo) | Monard, Maria Carolina (University of São Paulo) | Lee, Huei Diana (State University of West Paraná)
Feature selection plays an important role in machine learning and data mining, and it is often applied as a data pre-processing step. This task can speed up learning algorithms and sometimes improve their performance. In multi-label learning, label dependence is considered another aspect that can contribute to improving learning performance. A broad, replicable systematic review that we performed corroborates this idea. Based on this information, it is believed that considering label dependence during feature selection can lead to better learning performance. The hypothesis of this work is that multi-label feature selection algorithms that consider label dependence will perform better than the ones that disregard it. To this end, we propose multi-label feature selection algorithms that take into account label relations. These algorithms were experimentally compared to the standard approach for feature selection, showing good performance in terms of feature reduction and predictability of the classifiers built using the selected features.
- South America > Brazil > São Paulo (0.06)
- North America > United States (0.05)
Scalable Multi-Output Label Prediction: From Classifier Chains to Classifier Trellises
Read, J., Martino, L., Olmos, P., Luengo, D.
Multi-output inference tasks, such as multi-label classification, have become increasingly important in recent years. A popular method for multi-label classification is classifier chains, in which the predictions of individual classifiers are cascaded along a chain, thus taking into account inter-label dependencies and improving the overall performance. Several varieties of classifier chain methods have been introduced, and many of them perform very competitively across a wide range of benchmark datasets. However, scalability limitations become apparent on larger datasets when modeling a fully cascaded chain. In particular, the methods' strategies for discovering and modeling a good chain structure constitute a major computational bottleneck. In this paper, we present the classifier trellis (CT) method for scalable multi-label classification. We compare CT with several recently proposed classifier chain methods to show that it occupies an important niche: it is highly competitive on standard multi-label problems, yet it can also scale up to thousands or even tens of thousands of labels.
- Europe > Spain > Galicia > Madrid (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)